perception algorithm
Aria Gen 2 Pilot Dataset
Kong, Chen, Fort, James, Kang, Aria, Wittmer, Jonathan, Green, Simon, Shen, Tianwei, Zhao, Yipu, Peng, Cheng, Solaira, Gustavo, Berkovich, Andrew, Raina, Nikhil, Baiyya, Vijay, Oleinik, Evgeniy, Huang, Eric, Zhang, Fan, Straub, Julian, Schwesinger, Mark, Pesqueira, Luis, Pan, Xiaqing, Engel, Jakob Julian, Ren, Carl, Yan, Mingfei, Newcombe, Richard
The Aria Gen 2 Pilot Dataset (A2PD) is an egocentric multimodal open dataset captured using the state-of-the-art Aria Gen 2 glasses. To facilitate timely access, A2PD is released incrementally with ongoing dataset enhancements. The initial release features Dia'ane, our primary subject, who records her daily activities alongside friends, each equipped with Aria Gen 2 glasses. It encompasses five primary scenarios: cleaning, cooking, eating, playing, and outdoor walking. In each of the scenarios, we provide comprehensive raw sensor data and output data from various machine perception algorithms. These data illustrate the device's ability to perceive the wearer, the surrounding environment, and interactions between the wearer and the environment, while maintaining robust performance across diverse users and conditions. The A2PD is publicly available at projectaria.com, with open-source tools and usage examples provided in Project Aria Tools.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots (0.70)
Advancement and Field Evaluation of a Dual-arm Apple Harvesting Robot
Zhu, Keyi, Lammers, Kyle, Zhang, Kaixiang, Arunachalam, Chaaran, Bhattacharya, Siddhartha, Li, Jiajia, Lu, Renfu, Li, Zhaojian
Apples are among the most widely consumed fruits worldwide. Currently, apple harvesting fully relies on manual labor, which is costly, drudging, and hazardous to workers. Hence, robotic harvesting has attracted increasing attention in recent years. However, existing systems still fall short in terms of performance, effectiveness, and reliability for complex orchard environments. In this work, we present the development and evaluation of a dual-arm harvesting robot. The system integrates a ToF camera, two 4DOF robotic arms, a centralized vacuum system, and a post-harvest handling module. During harvesting, suction force is dynamically assigned to either arm via the vacuum system, enabling efficient apple detachment while reducing power consumption and noise. Compared to our previous design, we incorporated a platform movement mechanism that enables both in-out and up-down adjustments, enhancing the robot's dexterity and adaptability to varying canopy structures. On the algorithmic side, we developed a robust apple localization pipeline that combines a foundation-model-based detector, segmentation, and clustering-based depth estimation, which improves performance in orchards. Additionally, pressure sensors were integrated into the system, and a novel dual-arm coordination strategy was introduced to respond to harvest failures based on sensor feedback, further improving picking efficiency. Field demonstrations were conducted in two commercial orchards in Michigan, USA, with different canopy structures. The system achieved success rates of 0.807 and 0.797, with an average picking cycle time of 5.97 s. The proposed strategy reduced harvest time by 28% compared to a single-arm baseline. The dual-arm harvesting robot enhances the reliability and efficiency of apple picking. With further advancements, the system holds strong potential for autonomous operation and commercialization for the apple industry.
- North America > United States > Michigan > Ingham County > Lansing (0.05)
- North America > United States > Michigan > Ingham County > East Lansing (0.05)
- North America > United States > California (0.04)
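The abstract's "clustering-based depth estimation" can be illustrated with a minimal sketch: given ToF depth samples inside a detected apple's bounding box, group them into clusters and take the dominant cluster as the fruit, rejecting background and leaf pixels. The gap threshold and 1-D clustering below are illustrative assumptions, not the authors' actual pipeline.

```python
# Illustrative sketch (not the authors' code): clustering-based depth
# estimation for a detected apple. Depth samples inside a bounding box
# are grouped into clusters separated by large gaps; the dominant
# (largest) cluster is assumed to belong to the fruit.

def cluster_depths(samples, gap=0.05):
    """Group sorted depth samples into clusters separated by > gap metres."""
    ordered = sorted(samples)
    clusters, current = [], [ordered[0]]
    for d in ordered[1:]:
        if d - current[-1] <= gap:
            current.append(d)
        else:
            clusters.append(current)
            current = [d]
    clusters.append(current)
    return clusters

def apple_depth(samples, gap=0.05):
    """Return the centroid of the largest depth cluster."""
    largest = max(cluster_depths(samples, gap), key=len)
    return sum(largest) / len(largest)

# ToF depths (m) in a box: mostly fruit (~0.62 m), some background (~1.5 m).
samples = [0.61, 0.62, 0.63, 0.62, 1.50, 1.52, 0.61, 1.49]
print(round(apple_depth(samples), 2))  # dominant cluster -> 0.62
```

Taking the largest cluster rather than the mean of all samples is what makes the estimate robust to background returns leaking into the detection box.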
Instance Performance Difference: A Metric to Measure the Sim-To-Real Gap in Camera Simulation
In this contribution, we introduce the concept of Instance Performance Difference (IPD), a metric designed to measure the gap in performance that a robotics perception task experiences when working with real versus synthetic images. By pairing synthetic and real instances in the pictures and evaluating their performance similarity using perception algorithms, IPD provides a targeted metric that closely aligns with the needs of real-world applications. We explain and demonstrate this metric through a rock detection task in lunar terrain images, highlighting the IPD's effectiveness in identifying the most realistic image synthesis method. The metric is thus instrumental in creating synthetic image datasets that perform in perception tasks like their real-world photo counterparts. In turn, this supports robust sim-to-real transfer for perception algorithms in real-world robotics applications.
- North America > United States > Wisconsin > Dane County > Madison (0.15)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (2 more...)
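The pairing idea behind IPD can be sketched in a few lines: score each real instance and its synthetic counterpart with the same perception algorithm (detection IoU stands in for "performance" here) and measure how far the paired scores diverge. The exact per-instance score and the aggregation used by the authors may differ; this is only an illustration of the concept.

```python
# Hedged sketch of the Instance Performance Difference (IPD) idea.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def ipd(paired_scores):
    """Mean absolute difference between real and synthetic per-instance scores."""
    return sum(abs(r - s) for r, s in paired_scores) / len(paired_scores)

# Per-instance detection quality (IoU vs. ground truth) on paired images:
# each tuple is (score on the real rock, score on its synthetic twin).
pairs = [(0.81, 0.64), (0.81, 0.81)]
print(ipd(pairs))  # 0.085
```

A low IPD means the perception algorithm treats synthetic instances much like their real counterparts, which is exactly the property a sim-to-real training set needs.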
Acceleration method for generating perception failure scenarios based on editing Markov process
With the rapid advancement of autonomous driving technology, self-driving cars have become a central focus in the development of future transportation systems. Scenario generation technology has emerged as a crucial tool for testing and verifying the safety performance of autonomous driving systems. Current research in scenario generation primarily focuses on open roads such as highways, with relatively limited studies on underground parking garages. The unique structural constraints, insufficient lighting, and high-density obstacles in underground parking garages impose greater demands on the perception systems, which are critical to autonomous driving technology. This study proposes an accelerated generation method for perception failure scenarios tailored to the underground parking garage environment, aimed at testing and improving the safety performance of autonomous vehicle (AV) perception algorithms in such settings. The method presented in this paper generates an intelligent testing environment with a high density of perception failure scenarios by learning the interactions between background vehicles (BVs) and autonomous vehicles (AVs) within perception failure scenarios. Furthermore, this method edits the Markov process within the perception failure scenario data to increase the density of critical information in the training data, thereby optimizing the learning and generation of perception failure scenarios. A simulation environment for an underground parking garage was developed using the CARLA and Vissim platforms, with BEVFusion employed as the perception algorithm for testing. The study demonstrates that this method can generate an intelligent testing environment with a high density of perception failure scenarios and enhance the safety performance of perception algorithms within this experimental setup.
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
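The "editing the Markov process to increase the density of critical information" step can be pictured with a small sketch: treat a recorded scenario as a state sequence, keep the perception-failure states plus a little surrounding context, and trim long uneventful stretches. The state names, context window, and density measure below are assumptions for illustration, not details from the paper.

```python
# Illustrative sketch (details assumed): editing a scenario's state
# sequence so that failure-adjacent states dominate the training data.

def edit_sequence(states, critical, context=1):
    """Keep critical states plus `context` neighbours on each side."""
    keep = set()
    for i, s in enumerate(states):
        if s in critical:
            keep.update(range(max(0, i - context),
                              min(len(states), i + context + 1)))
    return [s for i, s in enumerate(states) if i in keep]

def critical_density(states, critical):
    """Fraction of states that carry perception-failure information."""
    return sum(s in critical for s in states) / len(states)

raw = ["cruise"] * 6 + ["occluded", "miss"] + ["cruise"] * 5 + ["glare"] + ["cruise"] * 4
crit = {"occluded", "miss", "glare"}
edited = edit_sequence(raw, crit)
print(critical_density(raw, crit), critical_density(edited, crit))
```

Training a scenario generator on the edited sequences means each gradient step sees proportionally more failure transitions, which is the acceleration the abstract describes.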
The POLAR Traverse Dataset: A Dataset of Stereo Camera Images Simulating Traverses across Lunar Polar Terrain under Extreme Lighting Conditions
Hansen, Margaret, Wong, Uland, Fong, Terrence
We present the POLAR Traverse Dataset: a dataset of high-fidelity stereo pair images of lunar-like terrain under polar lighting conditions designed to simulate a straight-line traverse. Images from individual traverses with different camera heights and pitches were recorded at 1 m intervals by moving a suspended stereo bar across a test bed filled with regolith simulant and shaped to mimic lunar south polar terrain. Ground truth geometry and camera position information was also recorded. This dataset is intended for developing and testing software algorithms that rely on stereo or monocular camera images, such as visual odometry, for use in the lunar polar environment, as well as to provide insight into the expected lighting conditions in lunar polar regions. The lunar south polar region is of particular interest to upcoming NASA missions such as the Volatiles Investigating Polar Exploration Rover (VIPER). [Figure 1: Hardware setup extended over SSERVI test bed with lunar terrain and lighting.]
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- North America > United States > Colorado > Jefferson County > Golden (0.04)
- (3 more...)
- Government > Space Agency (0.72)
- Government > Regional Government > North America Government > United States Government (0.72)
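Stereo datasets like this one exercise the standard pinhole-stereo depth relation, which is not specific to the paper but shows why the harsh polar lighting matters: depth is z = f·B/d for focal length f (pixels), baseline B (metres), and disparity d (pixels), so small disparity errors from poorly matched, low-sun images turn into large depth errors. A minimal sketch:

```python
# Standard pinhole-stereo depth from disparity (not code from the paper).

def depth_from_disparity(f_px, baseline_m, disparity_px):
    """z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return f_px * baseline_m / disparity_px

# With f = 800 px and B = 0.3 m, a 1 px disparity error near d = 12 px
# shifts the depth estimate by roughly 1.5 m.
print(depth_from_disparity(800, 0.3, 12.0))  # 20.0
print(depth_from_disparity(800, 0.3, 13.0))  # ~18.46
```

The inverse relationship is why datasets recorded under extreme lighting are valuable: they stress exactly the matching errors that dominate depth uncertainty at range.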
How Does Perception Affect Safety: New Metrics and Strategy
Zhang, Xiaotong, Chong, Jinger, Youcef-Toumi, Kamal
Perception serves as a critical component in the functionality of autonomous agents. However, the intricate relationship between perception metrics and robotic metrics remains unclear, leading to ambiguity in the development and fine-tuning of perception algorithms. In this paper, we introduce a methodology for quantifying this relationship, taking into account factors such as detection rate, detection quality, and latency. Furthermore, we introduce two novel metrics for Human-Robot Collaboration safety predicated upon perception metrics: Critical Collision Probability (CCP) and Average Collision Probability (ACP). To validate the utility of these metrics in facilitating algorithm development and tuning, we develop an attentive processing strategy that focuses exclusively on key input features. This approach significantly reduces computational time while preserving a similar level of accuracy. Experimental results indicate that the implementation of this strategy in an object detector leads to a maximum reduction of 30.091% in inference time and 26.534% in total time per frame. Additionally, the strategy lowers the CCP and ACP in a baseline model by 11.252% and 13.501%, respectively. The source code will be made publicly available in the final proof version of the manuscript.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > Italy (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)
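The abstract does not reproduce the CCP/ACP definitions, but the general idea of mapping perception metrics (detection rate, latency) to a collision probability can be sketched: a collision risk arises when detection fails for every frame within the robot's usable reaction window. Everything below, including the independence assumption across frames, is a hypothetical illustration rather than the paper's formulation.

```python
# Hypothetical sketch: how detection rate and latency might combine
# into a collision-relevant probability. Not the paper's CCP/ACP.

def window_miss_probability(detection_rate, latency_s, reaction_s,
                            frame_s=1.0 / 30.0):
    """Probability that detection fails for all frames in the usable
    reaction window, assuming (for illustration) independent frames."""
    usable = max(0.0, reaction_s - latency_s)
    frames = max(1, round(usable / frame_s))
    return (1.0 - detection_rate) ** frames

# Higher latency leaves fewer usable frames, raising the miss probability,
# which is why the paper's attentive strategy targets inference time.
print(window_miss_probability(0.9, latency_s=0.05, reaction_s=0.5))
print(window_miss_probability(0.9, latency_s=0.30, reaction_s=0.5))
```

Even this toy model shows the coupling the paper quantifies: shaving latency buys extra detection attempts before a collision becomes unavoidable, so speed improvements translate directly into safety-metric improvements.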
Exploration of the Assessment for AVP Algorithm Training in Underground Parking Garages Simulation Scenario
Simulation test scenarios are an important part of helping autonomous driving algorithms improve, but current simulation scenarios are still limited to manual approaches. The ultimate goal of this project is to generate Autonomous Valet Parking (AVP) simulation test scenarios in underground garages with differentiated difficulty levels through reinforcement learning, which will challenge the vehicle-side AVP algorithms and ultimately improve the algorithmic test metrics.
Abdullah's study, as described in [1], compared the space utilization efficiency of diagonal, parallel, and perpendicular parking methods. The research findings concluded that perpendicular parking methods yield the highest number of parking spaces. This conclusion was drawn using a university as a specific example. The study summarized in [2] focuses on smart parking solutions, emphasizing their significance in the context of urban growth and traffic congestion.
POLAR3D: Augmenting NASA's POLAR Dataset for Data-Driven Lunar Perception and Rover Simulation
Chen, Bo-Hsun, Negrut, Peter, Liang, Thomas, Batagoda, Nevindu, Zhang, Harry, Negrut, Dan
We report on an effort that led to POLAR3D, a set of digital assets that enhance the POLAR dataset of stereo images generated by NASA to mimic lunar lighting conditions. Our contributions are twofold. First, we have annotated each photo in the POLAR dataset, providing approximately 23,000 labels for rocks and their shadows. Second, we digitized several lunar terrain scenarios available in the POLAR dataset. Specifically, by utilizing both the lunar photos and the POLAR's LiDAR point clouds, we constructed detailed obj files for all identifiable assets. POLAR3D is the set of digital assets comprising rock/shadow labels and obj files associated with the digital twins of lunar terrain scenarios. This new dataset can be used for training perception algorithms for lunar exploration and synthesizing photorealistic images beyond the original POLAR collection. Likewise, the obj assets can be integrated into simulation environments to facilitate realistic rover operations in a digital twin of a POLAR scenario. POLAR3D is publicly available to aid perception algorithm development, camera simulation efforts, and lunar simulation exercises, at https://github.com/uwsbel/POLAR-digital.
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Montana (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- (10 more...)
- Government > Space Agency (0.86)
- Government > Regional Government > North America Government > United States Government (0.86)
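Consuming obj assets like POLAR3D's can be sketched with a minimal Wavefront parser. The `v`/`f` records and 1-based indexing below follow the standard .obj format; the tiny rock mesh is a made-up example, not a POLAR3D asset.

```python
# Sketch: parsing a Wavefront .obj mesh (the format POLAR3D's digital
# twins use). The example mesh is illustrative, not from the dataset.

def parse_obj(text):
    """Parse vertices and triangular faces from a Wavefront .obj string."""
    verts, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            verts.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":
            # .obj face indices are 1-based and may be "v/vt/vn" triplets;
            # keep only the vertex index, converted to 0-based.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:4]))
    return verts, faces

rock_obj = """
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
f 1 2 3
"""
verts, faces = parse_obj(rock_obj)
print(len(verts), faces[0])  # 3 (0, 1, 2)
```

Once parsed, the vertex/face arrays can be handed to any renderer or physics engine to place the digitized rocks in a simulated POLAR scenario.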
Introspective Perception for Mobile Robots
Rabiee, Sadegh, Biswas, Joydeep
Perception algorithms that provide estimates of their uncertainty are crucial to the development of autonomous robots that can operate in challenging and uncontrolled environments. Such perception algorithms provide the means for having risk-aware robots that reason about the probability of successfully completing a task when planning. There exist perception algorithms that come with models of their uncertainty; however, these models are often developed with assumptions, such as perfect data associations, that do not hold in the real world. Hence the resultant estimated uncertainty is a weak lower bound. To tackle this problem we present introspective perception - a novel approach for predicting accurate estimates of the uncertainty of perception algorithms deployed on mobile robots. By exploiting sensing redundancy and consistency constraints naturally present in the data collected by a mobile robot, introspective perception learns an empirical model of the error distribution of perception algorithms in the deployment environment and in an autonomously supervised manner. In this paper, we present the general theory of introspective perception and demonstrate successful implementations for two different perception tasks. We provide empirical results on challenging real-robot data for introspective stereo depth estimation and introspective visual simultaneous localization and mapping and show that they learn to predict their uncertainty with high accuracy and leverage this information to significantly reduce state estimation errors for an autonomous mobile robot.
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Greece > Ionian Islands > Corfu (0.04)
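The core mechanism of introspective perception, learning an empirical error model from autonomously supervised (feature, observed-error) pairs and querying it at deployment time, can be sketched simply. Binning by a single scalar feature stands in for the learned models in the paper; the feature choice and pessimistic fallback are illustrative assumptions.

```python
# Hedged sketch of introspective perception's core idea: an empirical,
# per-regime error model learned from self-supervised observations.
from collections import defaultdict

class EmpiricalErrorModel:
    def __init__(self, bin_width):
        self.bin_width = bin_width
        self.errors = defaultdict(list)

    def record(self, feature, error):
        """Log an observed error, e.g. obtained from a consistency check
        between redundant sensors or across time."""
        self.errors[int(feature / self.bin_width)].append(abs(error))

    def expected_error(self, feature):
        bucket = self.errors.get(int(feature / self.bin_width))
        if not bucket:
            return float("inf")  # no evidence for this regime: be pessimistic
        return sum(bucket) / len(bucket)

# Stereo depth errors grow with range; the model learns this from data
# rather than from an assumed analytic noise model.
model = EmpiricalErrorModel(bin_width=2.0)
for depth_m, err_m in [(1.0, 0.02), (1.5, 0.04), (9.0, 0.8), (9.5, 1.2)]:
    model.record(depth_m, err_m)
print(model.expected_error(1.2), model.expected_error(9.2))
```

A planner consuming `expected_error` can then down-weight or reject estimates from regimes where the perception algorithm has historically performed poorly, which is how the paper reduces state estimation errors.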
Camera simulation for robot simulation: how important are various camera model components?
Elmquist, Asher, Serban, Radu, Negrut, Dan
Modeling cameras for the simulation of autonomous robotics is critical for generating synthetic images with appropriate realism to effectively evaluate a perception algorithm in simulation. In many cases though, simulated images are produced by traditional rendering techniques that exclude or superficially handle processing steps and aspects encountered in the actual camera pipeline. The purpose of this contribution is to quantify the degree to which the exclusion from the camera model of various image generation steps or aspects affect the sim-to-real gap in robotics. We investigate what happens if one ignores aspects tied to processes from within the physical camera, e.g., lens distortion, noise, and signal processing; scene effects, e.g., lighting and reflection; and rendering quality. The results of the study demonstrate, quantitatively, that large-scale changes to color, scene, and location have far greater impact than model aspects concerned with local, feature-level artifacts. Moreover, we show that these scene-level aspects can stem from lens distortion and signal processing, particularly when considering white-balance and auto-exposure modeling.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Media > Photography (0.67)
- Leisure & Entertainment > Games > Computer Games (0.48)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
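Two of the camera-model components the study ablates, white-balance gains and sensor noise, can be illustrated as a post-processing stage applied to rendered pixels. The gains, noise level, and clipping below are illustrative values, not the paper's calibrated model.

```python
# Sketch (illustrative parameters, not the paper's model): applying
# per-channel white-balance gains and Gaussian read noise to rendered
# RGB pixels, then clipping to the sensor's output range.
import random

def apply_camera_model(pixels, wb_gains=(1.8, 1.0, 1.4),
                       noise_sigma=2.0, seed=0):
    """Post-process rendered (R, G, B) tuples in [0, 255]."""
    rng = random.Random(seed)  # seeded for reproducibility
    out = []
    for r, g, b in pixels:
        px = [c * gain + rng.gauss(0.0, noise_sigma)
              for c, gain in zip((r, g, b), wb_gains)]
        out.append(tuple(min(255.0, max(0.0, c)) for c in px))
    return out

rendered = [(100.0, 120.0, 90.0), (10.0, 240.0, 200.0)]
processed = apply_camera_model(rendered)
print(processed[0])
```

Because white balance shifts every pixel's color globally, omitting it from a simulator changes the whole image statistics, which is consistent with the study's finding that such scene-level effects dominate the sim-to-real gap over local, feature-level artifacts.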